Abstract — In models of social learning where rational agents 
can observe other agents’ actions, information cascades are said 
to occur when agents ignore their own private information and 
blindly follow the actions of other agents. It is well known that in 
some cases, incorrect cascades happen with positive probability 
leading to a loss in social welfare. Having agents provide reviews 
in addition to their actions provides one possible way to avoid 
such “bad cascades.” In this paper, we study one such model 
where agents sequentially decide whether or not to purchase a 
good, whose true value is either good or bad. If they purchase 
the good, agents also leave a review, which may be noisy. 
Conditioning on the underlying state of the world, we study 
the impact of such reviews on the asymptotic properties of 
cascades. For a good underlying state, we propose an algorithm 
that utilizes number theory and Markov analysis to solve for 
the probability of wrong cascade. We discover that depending 
on the review quality, reviews may change the probability of a 
wrong cascade in a non-monotonic manner. On the other hand, 
for a bad underlying state, the agents always eventually reach 
a correct cascade; we use a martingale analysis to bound the 
time until this happens. 

I. Introduction 

On-line platforms provide an easy way for people to 
attempt to learn from others before making a new decision. 
Such “social learning” has long been studied by economists 
as a game among Bayesian agents. In the simplest setting, 
these agents sequentially make a binary decision based on 
their own beliefs, which are in turn a function of their own 
private information as well as observations of the decisions 
of previous agents. A key result, first shown in [2] and [3], 
is that such models exhibit herding or information cascades. 
This refers to a case where some agents ignore their private 
information and follow the actions of the previous agents. 
Moreover, for the models in [2] and [3], once a cascade starts, 
all subsequent agents also cascade. Though individually 
optimal, this may result in the agents making a choice that is 
not socially optimal, which we refer to as a “wrong cascade.” 

A wrong cascade occurs because agents observe the ac- 
tions of other agents before the other agents receive their 
pay-offs, and so these actions reflect the agents’ estimates 
of the true pay-off and not the true pay-off itself. Indeed, if 
agents instead were able to see the true pay-off obtained by 
others, then as shown in [9] there would never be an incorrect 
cascade in which agents buy a bad product. The use of 
reviews and on-line recommendation systems can be viewed 
as an attempt to provide other agents with this information. 
However, due for example to user errors, such reviews may 
only be a noisy representation of this information (instead 
of the true pay-off as in [9]). 

The goal of this paper is to study social learning in the 
presence of such noisy reviews. More precisely, we consider 
a variation of the models in [2], [3], where agents have the 
option to either buy or not buy a given item, whose true 
value is one of two binary states (good or bad). In addition to 
the actions of the previous agents, agents also see a history 
of reviews before making their decisions. However, these 
reviews are not a perfect indication of the true state of the 
good due to two effects: first, as we have already mentioned, 
these reviews are noisy, and second, agents can only leave a 
review if they buy the good and so no additional information 
is given for agents that choose not to buy. 1 

Adding reviews is a way of changing the information 
structure in [2], [3]. A number of other ways have been 
considered that also change this structure such as changing 
the underlying network structure among the agents, e.g. [8], 
or changing the signal structure, e.g. [6]. In prior work ([10], 
[11]), we considered a variation of the information structure, 
where agents observed noisy versions of the actions of 
others. This led to the following counter-intuitive result: the 
probability of incorrect herding is non-monotonic in the noise 
level. In other words, in some cases, more noise is actually 
beneficial. In this paper, we again seek to study how varia- 
tions in the noise level affect the agents’ behaviors. However, 
here, agents perfectly observe the actions of previous agents 
and the only noise is in the reviews. Additionally, since only 
agents who buy the good can submit reviews, this leads to an 
asymmetry in the model that was not present in [10], [11]. 

We presented an initial analysis of this model in [12]. 
There it was shown that the asymmetry in reviewing leads 
to an asymmetry in the resulting users’ behaviors depending 
on the underlying state of the product being either good or 
bad. Here we present a more refined analysis of these two 
cases. Conditioned on the state of the product being good, 
we study the probability of an incorrect cascade. We give 
an algorithm based on number-theoretic arguments that 
enables characterizing this probability for a much larger set 
of parameter settings. Using this we then can characterize the 
behavior of the wrong cascade probability as a function of 
the noise level and show that this is a highly non-monotonic 

1 For example, many on-line platforms such as Amazon.com indicate 
verified purchase reviews; in our model only such reviews are considered. 



and discontinuous function, so that in some cases decreasing 
the review noise can lead to a higher probability of a wrong 
cascade. Conditioned on the state being bad, we instead focus 
on the expected time until a correct cascade occurs. Using 
martingale techniques, we give bounds on the expected time 
until a correct cascade happens. We compare these bounds 
with simulations and offer an algorithm to improve the lower 
bound numerically. 

Another strand of related work is the literature on “word- 
of-mouth” learning (e.g. [4], [5], [7]) in which agents can 
communicate information about payoff of past actions. How- 
ever, these models consider different settings (e.g. naive rule- 
of-thumb decision-based, random sampling of population); 
while our paper assumes that fully-rational agents can ob- 
serve all past actions and reviews. 

We organize this paper as follows. In Section II we specify 
our model. The main results are presented in sections III and 
IV for the cases where the value of product is “good” and 
“bad,” respectively. We conclude in Section V. 

II. Model 

We consider a model similar to [12] in which there is a 
countable population of agents, indexed n = 1,2,... with 
the index reflecting the time and the order in which agents 
act (given exogenously). There is a new product (item) with 
a true value ($V$) that can be either good ($G$) or bad ($B$); both 
possibilities are assumed to be equally likely and the value is 
the same for all agents. Each agent $n$ has a one-time action 
choice $A_n$ of saying either “Yes” ($Y$) or “No” ($N$) to this 
item. Each agent $n$ has prior knowledge about the 
true value $V$ via a private signal $S_n \in \{1 \text{ (high)}, 0 \text{ (low)}\}$. 
Each agent $n$ who chooses $A_n = Y$ submits a review 
$R_n \in \{G \text{ (Good)}, B \text{ (Bad)}\}$ representing his experience with 
the item after purchasing. On the other hand, if agent $n$ 
chooses $A_n = N$, he does not submit a review. 

We consider a homogeneous population where, conditioned 
on $V$, the private signals and reviews are i.i.d. across 
all agents. Assume the probability that a private signal (resp. 
a review) aligns with $V$ is $p \in (0.5, 1)$ (resp. $\delta \in [0.5, 1]$), 
i.e., the distributions of the signals and reviews are given as: 

$$P[S_n = 1 \mid V = B] = P[S_n = 0 \mid V = G] = 1 - p,$$
$$P[S_n = 1 \mid V = G] = P[S_n = 0 \mid V = B] = p,$$
and if $A_n = Y$, 
$$P[R_n = G \mid V = G] = P[R_n = B \mid V = B] = \delta,$$
$$P[R_n = G \mid V = B] = P[R_n = B \mid V = G] = 1 - \delta.$$

Since $p \in (0.5, 1)$, the private signals are informative, 
but not revealing; we call $p$ the signal quality. On the other 
hand, $\delta$ denotes the review’s strength. The review and the 
private signal are assumed to be conditionally independent 
given $V$. 2 Let $\mathcal{R}_n = R_n$ when $A_n = Y$ and $\mathcal{R}_n = *$ 
when $A_n = N$. The history after agent $n$ decides is 
written as $H_n = \{A_1, \mathcal{R}_1, \ldots, A_n, \mathcal{R}_n\}$; we assume that 
$H_n$ is public information to subsequent agents. The agents 

2 The motivation is that, while signal quality reflects a product's marketing 
efficiency, the review strength is a consequence of product reliability, e.g., 
due to manufacturing. 


are Bayes-rational, and their decisions are based on their own 
private signals and public information. Each agent $n$ updates 
his posterior belief about the true value $V$ using his private 
signal $S_n$, the actions $A_1, \ldots, A_{n-1}$, and the reviews 
$\mathcal{R}_1, \ldots, \mathcal{R}_{n-1}$. 3 

A. Public likelihood ratio as a Markov process 

Let $q = 1 - p$. Agents’ decisions are based on Bayes’ 
updates of the posterior probability of $V = B$ versus 
$V = G$ given the observed history $H_n$. However, due to 
the independence of signals from the public history, agent 
$n + 1$ can instead compare the public likelihood ratio, $\ell_n$, 
and his private belief $\beta_{n+1}$, of $V = B$ versus $V = G$. Since 
$V \in \{B, G\}$ is equally likely, $\ell_0 = 1$ and we can write 
$\ell_n$ and $\beta_{n+1}$ as: 

$$\ell_n = \frac{P[H_n \mid V = B]}{P[H_n \mid V = G]}, \quad \text{and} \quad \beta_{n+1} = \frac{P[S_{n+1} \mid V = B]}{P[S_{n+1} \mid V = G]}.$$


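The higher $\ell_n$ is, the more likely that $V = B$. Moreover, 
since $H_n$ is public information, $\ell_n$ can be updated as: 

• If agent $n$ follows his own signal then: 

$$\ell_n = \begin{cases} \frac{p}{q}\,\ell_{n-1}, & \text{if } A_n = N, \\ \frac{q}{p}\,\frac{1-\delta}{\delta}\,\ell_{n-1}, & \text{if } A_n = Y,\ \mathcal{R}_n = G, \\ \frac{q}{p}\,\frac{\delta}{1-\delta}\,\ell_{n-1}, & \text{if } A_n = Y,\ \mathcal{R}_n = B. \end{cases} \tag{2}$$

• Otherwise, if agent $n$ cascades then: 

$$\ell_n = \begin{cases} \ell_{n-1}, & \text{if } A_n = N, \\ \frac{1-\delta}{\delta}\,\ell_{n-1}, & \text{if } A_n = Y,\ \mathcal{R}_n = G, \\ \frac{\delta}{1-\delta}\,\ell_{n-1}, & \text{if } A_n = Y,\ \mathcal{R}_n = B. \end{cases} \tag{3}$$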


Moreover, as shown in the next lemma, given $\ell_n$ one can 
determine if an agent cascades or not. Thus, $\{\ell_n\}$ is a Markov 
process. Moreover, this is also true if in addition we condition 
on each value of $V$. 4 On the other hand, $\beta_{n+1} = q/p$ (resp. 
$p/q$) if $S_{n+1} = 1$ (resp. $S_{n+1} = 0$). 
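The update rules (2) and (3) can be sketched as a single function. This is a minimal illustration only: the function name, its arguments, and the string encoding of actions and reviews are assumptions made here, and whether the agent is cascading is passed in by the caller rather than derived.

```python
def update_likelihood_ratio(ell_prev, action, review, cascading, p, delta):
    """One step of the public likelihood ratio, following (2) and (3).

    action is "Y" or "N"; review is "G" or "B" (ignored when action is "N").
    """
    q = 1.0 - p
    factor = 1.0
    if not cascading:
        # A non-cascading action reveals the signal: Y means S_n = 1, N means S_n = 0.
        factor *= (q / p) if action == "Y" else (p / q)
    if action == "Y":
        # A review is only observed after a purchase.
        factor *= (1.0 - delta) / delta if review == "G" else delta / (1.0 - delta)
    return factor * ell_prev
```

A cascading "N" action leaves the ratio unchanged, which is the mechanism behind the permanence of N cascades discussed later in the paper.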


B. Agents’ decision rule and cascades’ condition 

Let $a_n$ and $r_n$ be two integer random variables denoting 
the two differences in actions ($\#Y - \#N$) and reviews 
($\#G - \#B$), respectively. Note that while $a_n$ excludes the 
actions caused by both types of cascades (since then the 
cascading actions provide no information), $r_n$ is unchanged 
whenever a review is not made due to an agent not buying. 

Lemma 1. Define $x = \log_{\delta/(1-\delta)}(p/q) \in (0, \infty)$ for $\delta \in 
(0.5, 1)$. Then: 

1) $\ell_n = (q/p)^{h_n}$, where the exponent $h_n = a_n + \frac{1}{x} r_n$, 

2) Conditioned on $V$, $(a_n, r_n)$ and $h_n$ are 2-D and 1-D 
Markov chains, respectively, for $n \geq 0$, 

3) Agent $n + 1$ cascades Y if $h_n > 1$, cascades N if 
$h_n < -1$, and follows his signal if $h_n \in [-1, 1]$. 

Proof. 1) By (2) and (3), $\ell_n = (q/p)^{a_n} ((1 - \delta)/\delta)^{r_n}$, thus 
$h_n$ can be written in terms of $a_n$ and $r_n$ as above. 


3 For simplicity, we assume indifferent agents follow their own signals. 
4 This is an extension of results from [6]. 



2) This is a direct consequence of the fact that $\{\ell_n\}$ is 
a Markov process and that, from the first property, there 
is a 1-1 correspondence between $\ell_n$ and $h_n$. Further, since 
$a_n$ and $r_n$ are integer-valued it follows that $h_n$ only takes 
on a countable number of values. Without reviews a similar 
Markov chain was used in [10]. 

3) Since agent $n + 1$ makes his decision by comparing 
$\ell_n \beta_{n+1}$ to 1, agent $n + 1$ cascades Y if $\ell_n < q/p$, cascades 
N if $\ell_n > p/q$, and follows his signal if $\ell_n \in [q/p, p/q]$. By 
1), this is translated to the given condition on $h_n$. □ 
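As a quick numerical companion to Lemma 1, the sketch below computes x from p and the review strength, forms the exponent from the two counters, and applies the threshold rule of part 3. The function names are illustrative assumptions, not taken from the paper.

```python
import math

def relative_strength(p, delta):
    """x = log base delta/(1-delta) of p/q; a small x means strong reviews."""
    q = 1.0 - p
    return math.log(p / q) / math.log(delta / (1.0 - delta))

def exponent(a_n, r_n, x):
    """h_n = a_n + r_n / x, so that the likelihood ratio equals (q/p) ** h_n."""
    return a_n + r_n / x

def next_agent_behavior(h_n):
    """Decision regime of agent n + 1 according to Lemma 1, part 3."""
    if h_n > 1:
        return "cascade Y"
    if h_n < -1:
        return "cascade N"
    return "follow signal"
```

For instance, with equal signal and review strength (delta equal to p) the value of x is 1, so a single good review moves the exponent by exactly as much as one informative purchase.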

Note that x is an indicator of how strong the reviews 
are with respect to the signals. That is, the lower x is, 
the stronger the reviews are relative to the signals. For a 
generic $x$, the dynamics of the process $\{\ell_n\}$ can be studied 
by investigating the 2-D Markov chain $(a_n, r_n)$. However, for 
special values of x, this can be simplified to certain extents. 
We will study two of such scenarios in Section III. 

C. Asymmetry by different types of cascade and item quality 

This model exhibits asymmetric behaviors with respect 
both to the types of cascades ( Y and N), and to the true value 
V of the item. In particular, the arrival of new information 
(reviews) depends on the action chosen by each agent. We 
first highlight a key difference between Y and N cascades 
in the following two properties. 

Property 1. Once a Y cascade starts, there is a positive 
probability that it ends (unless the reviews are perfect). 

If agent $n$ faces $h_{n-1} > 1$, he chooses $A_n = Y$ regardless 
of his signal, and thus initiates a Y cascade. Such a cascade 
can end if subsequent agents submit a sufficient number of 
bad reviews, e.g., if $\mathcal{R}_n = B$, then $h_n = h_{n-1} - 1/x$ could 
fall to 1 or below, which induces agent $n + 1$ to use his signal. 
Furthermore, if $x$ is sufficiently small then agent $n$’s bad 
review can make $h_n < -1$, so that agent $n + 1$ starts an N 
cascade. The dynamics of a Y cascade, once it gets started, 
are determined solely by the reviews process (and it does not 
depend on the signals). Regardless of the time a Y cascade 
was initiated, it can be broken by a sufficiently long sequence 
of bad reviews. Thus, the history process {H n } could include 
sample paths where Y cascades start and stop multiple times. 

On the other hand, once $h_n < -1$ and an N cascade starts, it 
lasts forever. This is because agents who choose N do not 
generate reviews; thus, the likelihood ratio stays constant as 
soon as any agent cascades to N. Subsequent agents are 
left in the same state as the one who initiated the cascade, 
and thus make the same action choice. We summarize this in the 
following Property 2. 

Property 2. Once a N cascade happens, it lasts forever. 

Next we give two properties that show the differences 
between a good and a bad product. 

Property 3. For $V = G$, a wrong cascade happens with 
positive probability. 

This comes as a result of the existence of the absorbing 
states for a wrong cascade. For example, if the first two 


agents have low signals, they both choose N; therefore no 
review is collected. As a result, all subsequent agents are 
drawn into an N cascade, which is irreversible. This possibility 
cannot be avoided by adjusting the review strength, $\delta$, 
even to perfect quality. Even when the reviews are perfect, we 
would still need a non-cascading agent with a high signal 
to buy the item so that his review is submitted. 

Though a wrong cascade is possible, for V = G, it is 
more likely that there will be an abundance of information, 
since each agent that chooses Y also creates a new review. 
Since reviews are independent of signal, when V = G more 
agents choose Y, and new information begets further new 
information. In other words, when $V = G$ the underlying 
Markov process has a drift toward the correct cascade, but 
there is no absorbing state on that side since $h_n$ is unbounded 
above. However, multiple absorbing states for a wrong cascade 
might exist. For $V = G$, the quantity of interest is the 
probability of a wrong (N) cascade, which is a function of 
both $p$ and $\delta$. We will discuss this scenario in Section III. 

On the other hand, when V = B, this model exhibits 
a different set of behaviors. As more agents purchase the 
item, more and more reviews are collected. Since reviews are 
informative, subsequent agents can track the difference in the 
number of reviews to learn the true value of V eventually. In 
other words, while there are only trapping states for correct 
cascade, the drift also leans toward this side. We summarize 
this result below. 

Property 4. For $V = B$ and $\delta > 0.5$, a wrong cascade can 
never happen. 5 

Thus, for V = B correct cascade happens with probability 
1. In this scenario, we are interested in the distribution of 
the time (i.e. the number of agents) until a correct cascade 
happens. This will be studied in section IV. 

III. Probability of wrong cascade for V = G 

In the previous section, we discussed that wrong (N) cascades 
could happen if the product is good. In this section, we 
determine the probability of this happening. For a fixed $p$, as 
$x$ varies the conditions on $a_n$ and $r_n$ under which cascades happen 
also change. As a result, the underlying Markov chains 
have different structures (both in terms of the state spaces 
and the transition probabilities). Despite the complexity of 
these dynamics for a generic $x$, interesting and non-intuitive 
insights can be drawn by looking at special values of $x$. In 
one example, for any rational $x$, many states of $(a_n, r_n)$ can 
be mapped to one single state of $(h_n)$; thus it is sufficient to 
study the reduced 1-D Markov chain $(h_n)$. This is generally 
not possible for an arbitrary real value of $x$ since there would be a 
1-1 mapping between the states in $(a_n, r_n)$ and $(h_n)$, which 
prevents the simplification of the state space. However, 
in another example when $x$ is real and $x \leq 1/3$, the state 
space of $(a_n, r_n)$ can also be simplified to obtain analytical 
results. In particular, we consider two scenarios that facilitate 
simplification of the underlying state space of $(a_n, r_n)$: 

5 Note that if $\delta = 0.5$, then reviews are useless, in which case wrong cascades 
can occur as in [2]. 



[Fig. 1 plot title: p = 0.70, n = 100, 10000 samples] 


1) $x$ is a rational number in $(0, \infty)$, and 

2) x is any real number in (0, 1/3]. 

A. x = i/ j for positive integers i, j and gcd(i,j) = 1 

From the discussions at the beginning of this section, it 
is sufficient in this case to consider the 1-D Markov chain 
$(h_n)$. Let $P_s$ be the asymptotic probability of a wrong cascade 
starting from the state $h_0 = s$. We want to find $P_0$. Given 
$i$, consider the finite set $\mathcal{A} = \{-1, -\frac{i-1}{i}, \ldots, \frac{i-1}{i}, 1\}$. It is 
obvious that $\mathcal{A}$ is the set of all possible values that $h_n = 
a_n + \frac{j}{i} r_n$ can take in $[-1, 1]$. Depending on the value of $x$, 
the following Lemma 2 further reduces the set of accessible 
states for $h_n \in [-1, 1]$ to different subsets of $\mathcal{A}$. 

Lemma 2. Assume $x = i/j$ is rational, where $i, j$ are 
positive integers with $\gcd(i, j) = 1$: 

1) If $x < 1/3$, or if $x \in \{1/2, 1\}$, $h_n \in \{-1, 0, 1\}$, 

2) If $1/3 < x < 1/2$, let $z = j \bmod i$ and $k = \lfloor i/z \rfloor$. Then 
$h_n \in \{-1, 0, 1, -\frac{z}{i}, -\frac{2z}{i}, \ldots, -\frac{kz}{i}, \frac{i-z}{i}, \frac{i-2z}{i}, \ldots, \frac{i-kz}{i}\}$, 

3) If $x > 1/2$, $h_n$ takes all the values in $\mathcal{A}$. 

Proof Idea. The proof uses number theoretic arguments to 
find what values in $\mathcal{A}$ can be obtained (see Appendix). □ 
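Lemma 2 can be checked numerically for any given pair (i, j) by enumerating the accessible states directly. The sketch below is an illustration under stated assumptions: it works with integer numerators over the common denominator i, uses the increments described in the Appendix (a step of -i, i + j or i - j while the agent follows his signal, and steps of j up or down inside a Y cascade), and the helper names are hypothetical. A Y cascade is collapsed to its exit point, since the first state at or below i reached by downward steps of j is the only way out of it.

```python
def land(h, i, j):
    """Follow a Y cascade down in steps of j until the numerator is at most i."""
    while h > i:
        h -= j
    return h

def accessible_numerators(i, j):
    """Numerators of the accessible states h_n in [-1, 1], starting from h_0 = 0."""
    seen = {0}
    stack = [0]
    while stack:
        h = stack.pop()
        for step in (-i, i + j, i - j):
            nxt = land(h + step, i, j)
            # States below -i are absorbing N cascades and are not recorded.
            if nxt >= -i and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(seen)

def lemma2_case2(i, j):
    """Numerators predicted by Lemma 2, part 2, valid for 1/3 < i/j < 1/2."""
    z = j % i
    k = i // z
    states = {-i, 0, i}
    for m in range(1, k + 1):
        states.update({-m * z, i - m * z})
    return sorted(states)
```

For example, `accessible_numerators(5, 12)` can be compared against `lemma2_case2(5, 12)`, and `accessible_numerators(1, 4)` against the three-state set of part 1.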
As a consequence of Lemma 2, we can numerically solve 
for $P_0$. The idea is based on Markov analysis where one can 
write down a system of linear equations (LEs) with the 
set of variables being $P_{h_n}$ for all accessible states $h_n$. Since 
there is no absorbing state for a Y cascade, $h_n$ is not upper-bounded 
and the accessible state space is infinite. However, 
once $h_n > 1$ the state transition dynamics are simplified to 
a birth-death process; thus any variable $P_{h_n}$ where $h_n > 1$ 
can be expressed in terms of the corresponding variables 
where $h_n \in \mathcal{A}$. Therefore, the number of equations is finite 
(at most $2i + 1$). We propose the following Algorithm 1 to 
construct the system of linear equations and solve for $P_0$: 

Algorithm 1 Wrong cascade probability at rational $x$ 

Input: $V = 1$, $p$, $x = i/j$, $\gcd(i, j) = 1$ 
Output: LEs and solution $P_0$ 

$\delta \leftarrow 1/(1 + (q/p)^{1/x})$, $q \leftarrow 1 - p$, $a \leftarrow (1 - \delta)/\delta$ 
Initialize $\mathcal{A} \leftarrow \{-1, -(i - 1)/i, \ldots, (i - 1)/i, 1\}$ 
$\mathcal{A} \supseteq \mathcal{A}' \leftarrow$ accessible states in $[-1, 1]$ (Lemma 2). 
for $h_n = s \in \mathcal{A}'$ do 
    $s_L \leftarrow s - 1$, $s_{HB} \leftarrow s + 1 - j/i$, $s_{HG} \leftarrow s + 1 + j/i$ 
    $C_1 \leftarrow$ min number of steps from $s_{HG}$ to $S_1 \in \mathcal{A}'$. 
    if $s_{HB} > 1$ then 
        $C_2 \leftarrow$ min number of steps from $s_{HB}$ to $S_2 \in \mathcal{A}'$ 
    $Eq_s \leftarrow P_s = q P_{s_L} + p\delta a^{C_1} P_{S_1} + p(1 - \delta) a^{C_2} P_{S_2}$ 
    Add equation $Eq_s$ to the system of LEs 
Solve for $P_0$ and return. 
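One possible reading of Algorithm 1 is sketched below. Several choices are assumptions made for this illustration and are not spelled out in the paper: states are stored as integer numerators over i, the accessible set is discovered by exploration from the initial state rather than taken from Lemma 2, a landing state below -1 is treated as a wrong cascade with probability one, and the linear system is solved with a plain Gauss-Jordan elimination. The factor a raised to the number of steps uses the standard fact that a biased walk with up-probability delta greater than 1/2 ever drops one net level with probability (1 - delta)/delta.

```python
def wrong_cascade_probability(p, i, j):
    """Sketch of Algorithm 1 for V = G and x = i/j, with gcd(i, j) = 1."""
    q = 1.0 - p
    delta = 1.0 / (1.0 + (q / p) ** (j / i))
    a = (1.0 - delta) / delta  # chance a Y cascade ever drops one net review

    def land(h):
        """Descend through the Y-cascade region; return (exit state, steps)."""
        steps = 0
        while h > i:
            h, steps = h - j, steps + 1
        return h, steps

    # Transitions out of each accessible state in [-i, i], found from h_0 = 0.
    moves = {}
    stack = [0]
    while stack:
        s = stack.pop()
        if s in moves:
            continue
        moves[s] = []
        for step, prob in ((-i, q), (i + j, p * delta), (i - j, p * (1.0 - delta))):
            target, steps = land(s + step)
            moves[s].append((target, prob * a ** steps))
            if target >= -i:
                stack.append(target)

    # Augmented system (I - T) P = b, where b collects jumps into N cascades.
    states = sorted(moves)
    col = {s: k for k, s in enumerate(states)}
    n = len(states)
    rows = [[float(r == c) for c in range(n)] + [0.0] for r in range(n)]
    for s in states:
        for target, weight in moves[s]:
            if target < -i:
                rows[col[s]][n] += weight
            else:
                rows[col[s]][col[target]] -= weight

    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(rows[r][c]))
        rows[c], rows[pivot] = rows[pivot], rows[c]
        for r in range(n):
            if r != c:
                f = rows[r][c] / rows[c][c]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[c])]
    return rows[col[0]][n] / rows[col[0]][col[0]]
```

Under these assumptions the sketch can be compared with the closed forms quoted in the text, for example `wrong_cascade_probability(0.7, 1, 1)` against `(0.3 / 0.7) ** 2`.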


Using Algorithm 1, numerical values for the probability 
of wrong cascade can be solved for, given x rational. Fig. 1 
compares those numerical values to simulation results. 



Figure 1: Wrong cascade probability for V = 1. 

Both numerical and simulation results in Fig. 1 show 
that the probability of a wrong cascade is not monotonic 
in the review quality $\delta$. As $\delta$ varies in $[0.5, 1]$, there are 
points of discontinuity resulting from the changes in the 
state space and the transition probabilities of the underlying 
Markov chains $(h_n)$. For low review quality, simulation fails 
to evaluate $P_0$ since the mean number of agents, $n$, needed 
until an N cascade happens approaches infinity as $\delta \to 0.5$. 
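A simulation of the kind compared against the numerical values could look like the following sketch. It is not the authors' simulation code: the function name, the default horizon and sample size, and the fixed seed are assumptions, and sample paths that have not entered an N cascade within the horizon are counted as not wrong. The state is kept as an integer numerator over i so that threshold comparisons are exact.

```python
import random

def estimate_wrong_cascade(p, i, j, num_agents=200, num_samples=5000, seed=0):
    """Monte Carlo estimate of the wrong (N) cascade probability for V = G, x = i/j."""
    rng = random.Random(seed)
    q = 1.0 - p
    delta = 1.0 / (1.0 + (q / p) ** (j / i))  # review strength implied by x

    def step(num):
        if num <= i:  # h_n in [-1, 1]: the agent follows his signal
            if rng.random() >= p:  # low signal: no purchase, no review
                return num - i
            num += i  # high signal: purchase
        # Any purchase (by signal or by Y cascade) produces a review.
        return num + j if rng.random() < delta else num - j

    wrong = 0
    for _ in range(num_samples):
        num = 0
        for _ in range(num_agents):
            num = step(num)
            if num < -i:  # N cascade: absorbing, and wrong since V = G
                wrong += 1
                break
    return wrong / num_samples
```

As the review strength approaches 0.5 the required horizon grows, which matches the remark above that simulation becomes unreliable for low review quality.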

As a consequence of Lemma 2, for certain values of $x$ 
the state space is simplified enough that we can obtain 
closed-form expressions for the wrong cascade probability. 
In particular, Proposition 1 in [12] showed that for $x = 1$ and 
$x = 1/2$, $P_0 = (q/p)^2$. Moreover, when there are no reviews, 
a result from [2] gives $P_0 = (q/p)^2 / [(q/p)^2 + 1] < (q/p)^2$. 
Thus, having reviews with strength equal to or double the signal 
quality strictly increases the probability of wrong cascades. 
Further, from Fig. 1, this is true for any $1/2 < x < 1$. 
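The comparison between the two closed forms is simple arithmetic and can be tabulated with a short sketch; the function names are illustrative only.

```python
def wrong_cascade_matched_reviews(p):
    """P_0 = (q/p)^2, quoted in the text for x = 1 and x = 1/2."""
    q = 1.0 - p
    return (q / p) ** 2

def wrong_cascade_no_reviews(p):
    """P_0 = (q/p)^2 / ((q/p)^2 + 1), the no-review value quoted from [2]."""
    q = 1.0 - p
    ratio_sq = (q / p) ** 2
    return ratio_sq / (ratio_sq + 1.0)

if __name__ == "__main__":
    for p in (0.55, 0.7, 0.9):
        print(p, wrong_cascade_no_reviews(p), wrong_cascade_matched_reviews(p))
```

The gap between the two values is largest for weak signals, since the no-review value stays below 1/2 while the other approaches 1 as p approaches 0.5.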

B. x is any real value in (0, 1/3]. 

In this section, we present the second scenario in which the 
state space of $(a_n, r_n)$ can also be simplified. In particular, 
we look at the cases when reviews are at least three times 
stronger than the private signals. For any real value of $x$ 
in this region, the state transitions of the underlying 2-D 
Markov chains are shown in Fig. 2, where the first and 
second coordinates denote $r_n$ and $a_n$, respectively. By Proposition 2 
in [12], we can obtain the closed-form expression 
for the probability of a wrong cascade: 

$$P_0 = \frac{1 - p(2\delta - 2p\delta + 2p - p/\delta)}{1 - 2pq(1 - \delta)}, \tag{4}$$

which is decreasing in $\delta$. This is illustrated in Fig. 3 for $x = 
1/5, 1/10$. For all values of $x$ in this figure, the probability of 
a wrong cascade decreases in the signal quality $p$. Moreover, 
except for reviews with perfect accuracy, one would prefer 
having no reviews for sufficiently low signal quality. 
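Expression (4) is easy to evaluate directly. The sketch below is only a convenience wrapper with an assumed function name; it is meaningful when the pair (p, delta) corresponds to x in (0, 1/3], that is, when delta is at least 1/(1 + (q/p)^3).

```python
def wrong_cascade_closed_form(p, delta):
    """Equation (4): wrong cascade probability for V = G and x in (0, 1/3]."""
    q = 1.0 - p
    numerator = 1.0 - p * (2 * delta - 2 * p * delta + 2 * p - p / delta)
    return numerator / (1.0 - 2 * p * q * (1.0 - delta))
```

With perfect reviews (delta equal to 1) the expression reduces to q squared, which is the probability that the first two agents both receive low signals.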




Figure 3: Wrong cascade probability for V = G. 

The above conclusions can be explained by the discontinuity 
of the slopes for different curves in the above figure 
as $p \to 0.5$. With perfect reviews, the slope as $p \to 0.5$ 
is $-1$. With no review, the corresponding slope is $-2$. 
However, as long as the platform generates reviews with 
strength $\delta$ bounded away from 0.5, the probabilities of a wrong 
cascade follow the set of dashed curves shown, with slopes 
bounded away from $-2$. In particular, these slopes can be 
studied using (4) by setting $p = 0.5 + \epsilon$ and $x = C\epsilon$ where 
$\epsilon \to 0$, $C > 0$. When $C$ is fixed, $\delta$ is bounded away from 
0.5; this yields a slope of $-\infty$ as $\epsilon \to 0$. When $C \to \infty$ 
and $x < 1/3$, $\delta \to 0.5$, and the slopes vary in $(-\infty, -8)$. 
Finally, if $C \to 0$, $\delta \to 1$; the slopes approach $-1$, which is 
exactly the slope of the perfect reviews scenario. 

IV. Time until correct cascade for V = B 

In Section II, we argued that for a bad product, only a 
correct (N) cascade can happen, and that it lasts forever once it 
occurs. In this section, we examine both upper and lower 
bounds on the expected time until a correct cascade. In the 
following let $n \geq 0$. Conditioned on $V \in \{G, B\}$, let $\{\mathcal{F}_n\}$ 
be the sequence of $\sigma$-algebras generated by $\{H_n\}$. Similar 
to [6] and [8] where reviews do not exist, in our model the 
Markov process $\{\ell_n\}$ also exhibits the martingale property. 
In Section IV of [12] we showed that $\{1/\ell_n\}$ (resp. $\{\ell_n\}$) is 
a martingale process conditioned on $V = B$ (resp. $V = G$) 
adapted to the corresponding filtration $\{\mathcal{F}_n\}$. Moreover, 

let $X$ and $Y$ be two random variables representing the 
increments $\Delta h_n = h_{n+1} - h_n$ for $h_n$ in $[-1, 1]$ and $h_n > 1$, 
respectively. Let $f_1(\lambda)$ and $f_2(\lambda)$ be their corresponding 
moment generating functions (MGFs), where $\lambda$ is a real 
variable. Let $\rho = \max(f_1(\lambda), f_2(\lambda))$ and define the random 
process $\{M_n\} = \{e^{\lambda h_n} \rho^{-n}\}$. Using techniques from [1], in 
[12] we showed that $\{M_n\}$ is a super-martingale adapted to 
$\{\mathcal{F}_n\}$. Let $\tau = \min\{n > 0 : h_n < -1\}$ be the stopping 
time when an N cascade happens. Now we use these results 
to bound the expected time until a correct cascade, $E[\tau]$. 

A. Upper bound on E[r] 

Proposition 1. $E[\tau] \leq e^{\lambda}/(1 - \rho)$, where $0 < \rho < 1$, $\lambda \in 
(0, \ln(p/(1 - p)))$. 

Proof. From Proposition 3 in [12], the tail distribution is 
upper-bounded by: $P[\tau > n] \leq e^{\lambda} \rho^n$. For feasibility, we 
require $0 < \rho < 1$, thus $\lambda \in (0, \ln(p/(1 - p)))$. Now, since $\tau$ 
is a positive integer random variable, we can write: 

$$E[\tau] = \sum_{n=0}^{\infty} P[\tau > n] \leq e^{\lambda}/(1 - \rho).$$

□ 

The above bound is a function of the dummy variable $\lambda$, 
and the two MGFs $f_1, f_2$. Our objective is to find $\lambda$ and 
$\rho$ that minimize this bound. We solve this numerically and 
compare the minimum bound with the mean time obtained 
by Monte-Carlo simulations for different values of $p$ and $\delta$. 
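The numerical minimization can be sketched with a simple grid search. The two MGFs used here are not written out in this paper; they are derived, as an assumption of this sketch, from the update rules (2) and (3) conditioned on V = B: inside [-1, 1] the increment is -1 with probability p, 1 + 1/x with probability q(1 - delta), and 1 - 1/x with probability q delta, while inside a Y cascade it is +1/x with probability 1 - delta and -1/x with probability delta. The function name and grid size are also illustrative.

```python
import math

def upper_bound_expected_time(p, delta, grid=2000):
    """Grid-search the bound exp(lam) / (1 - rho) over lam in (0, ln(p/q))."""
    q = 1.0 - p
    inv_x = math.log(delta / (1.0 - delta)) / math.log(p / q)  # this is 1/x
    lam_max = math.log(p / q)
    best = math.inf
    for k in range(1, grid):
        lam = lam_max * k / grid
        # MGF of the increment when the agent follows his signal (V = B).
        f1 = (p * math.exp(-lam)
              + q * (1.0 - delta) * math.exp(lam * (1.0 + inv_x))
              + q * delta * math.exp(lam * (1.0 - inv_x)))
        # MGF of the increment inside a Y cascade (V = B).
        f2 = (1.0 - delta) * math.exp(lam * inv_x) + delta * math.exp(-lam * inv_x)
        rho = max(f1, f2)
        if rho < 1.0:
            best = min(best, math.exp(lam) / (1.0 - rho))
    return best
```

A finer grid or a one-dimensional optimizer could replace the loop; the grid is used here only to keep the sketch dependency-free.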

B. Lower bound on E[r] 

Let $\bar{\rho} = 1/\max(f_1(\lambda), f_2(\lambda))$ for regions where $0 < \bar{\rho} < 
1$, i.e. $\lambda \in (\ln(p/(1 - p)), \infty)$. The following Proposition 
provides a lower bound on $E[\tau]$. 

Proposition 2. $E[\tau] \geq e^{-\lambda} \bar{\rho} [1 - A]/[M_1(1 - \bar{\rho})]$, where 
$0 < \bar{\rho} < 1$, $\lambda \in (\ln(p/(1 - p)), \infty)$, 
$A = \bar{\rho}^4 + (\bar{\rho} - \bar{\rho}^4) V_1 + (\bar{\rho}^2 - \bar{\rho}^4) V_2 + (\bar{\rho}^3 - \bar{\rho}^4) V_3$, and 
$V_n = P[\tau = n \mid h_1]$ for $n = 1, 2, 3$. 

Proof Idea. The proof uses the super-martingale property of 
$\{M_n\}$ and the total probability theorem using the three possible 
values of $h_1$. Conditioned on each $h_1$, we calculate the 
probabilities of $\tau$ taking the first three values $n = 1, 2, 3$. 
We then use these probabilities to provide a lower bound on 
$E[\tau]$. See Appendix for details. □ 

Note that the above lower bound is then numerically 
maximized over $\lambda \in (\ln(p/(1 - p)), \infty)$. Moreover, due 
to computational constraints, the bound in Proposition 2 
is obtained using the closed-form expressions of $V_n$ for 
$n = 1, 2, 3$. Next, we present an algorithm that improves 
this lower bound by numerically calculating $V_n$ for higher 
values of $n$. 


Algorithm 2 Finding $P[\tau = n \mid h_1]$ 

Input: $V = 0$, $p$, $\delta$, $n$ 
Output: $V_n = P[\tau = n \mid h_1]$ 

Idea: build a breadth-first tree conditioned on $h_1$. 
tree.add(root), qualified = empty list of qualified nodes 
while tree.notempty() do 
    Pick the first node $j$ at the lowest level $i$ by BFS 
    Check for early elimination, e.g. $h_j > 1 + (n - i)/x$ 
    if $i < n$ (not a leaf) then check for $h_j \geq -1$ 
        if True then tree.add($j$'s children) 
            update condition on node $j$'s children 
    else (leaf) check for $h_j < -1$ 
        if True then qualified.add($j$) 
    tree.remove($j$) 
Update $V_n$ using the qualified list of leaves, return $V_n$ 
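The same first-passage probabilities can also be computed by propagating probability mass over the pairs of action and review counters, which merges histories that reach the same state. This is a swapped-in dynamic-programming sketch, not the breadth-first tree of Algorithm 2. The function name and the way h_1 is passed (through its coordinates a1 and r1) are assumptions, the transition probabilities are those implied by the model for V = B, and the threshold tests use floating point, so a state landing exactly on a threshold with a nonzero review count could be misclassified.

```python
import math
from collections import defaultdict

def first_passage_probabilities(p, delta, a1, r1, n_max):
    """Return [P[tau = n | h_1] for n = 1..n_max] under V = B.

    h_1 is given through its coordinates, h_1 = a1 + r1 / x.
    """
    q = 1.0 - p
    inv_x = math.log(delta / (1.0 - delta)) / math.log(p / q)  # this is 1/x
    mass = {(a1, r1): 1.0}  # distribution of non-absorbed states at time 1
    result = []
    for _ in range(n_max):
        absorbed = 0.0
        nxt = defaultdict(float)
        for (a, r), prob in mass.items():
            h = a + r * inv_x
            if h < -1:  # first entry into an N cascade at this time step
                absorbed += prob
            elif h > 1:  # Y cascade: only the review counter moves
                nxt[(a, r + 1)] += prob * (1.0 - delta)
                nxt[(a, r - 1)] += prob * delta
            else:  # the agent follows his signal
                nxt[(a - 1, r)] += prob * p
                nxt[(a + 1, r + 1)] += prob * q * (1.0 - delta)
                nxt[(a + 1, r - 1)] += prob * q * delta
        result.append(absorbed)
        mass = dict(nxt)
    return result
```

Because states are merged, the number of tracked states grows polynomially in n rather than with the number of histories, which may allow larger n than an explicit tree.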


C. Numerical and Simulation results 

In Fig. 4 below, we use Algorithm 2 to show how the 
lower bound can be improved as $n$ is increased. Conditioned 
on each $h_1$, computational constraints limit us to using at 
most $n = 17$, which generates approximately $10^5$ possible 
realizations of the history that would lead to an N cascade. 
The algorithm offers more improvement for lower values of 
$\delta$. The non-monotonicities and discontinuities of the bound 
are a consequence of the same behaviors of each $V_n$. 



Figure 4: Bounds of log(E[r]) versus simulation. 

Fig. 4 also shows the numerical bounds as compared with 
simulations on log-scale. Simulation results showed that $E[\tau]$ 
is decreasing as $\delta$ increases. As $\delta \to 1$, the lower bound 
offers a better approximation to simulations; both converge 
to the same value of $1 + p$. On the other hand as $\delta \to 0.5$, 
both the upper bound and the simulation results blow up, 
while the lower bound offers less information. 

In fact, we can verify that as $\delta \to 0.5$, we have $E[\tau] \to \infty$. 
For any $\delta \in [0.5, 1]$, there is a positive probability of the 
underlying Markov chain hitting a state in the region where 
$h_n > 1$. In this region, the process becomes a simple 
birth-death process with transition probabilities $\delta$, $1 - \delta$ to the left 
and right, respectively. By relabeling the states, assume that 
we start at a state $i > 0$ in this birth-death process where 0 
is an absorbing state on the left and there is no absorbing 
state on the right. Let $\tau_i = \min\{n > 0 : h_n = 0 \mid h_0 = i\}$ 
be the stopping time at which the absorbing state 0 is reached. For this 
birth-death process, the recurrence equation is written as: 
$(1 - \delta) E[\tau_{i+1}] - E[\tau_i] + \delta E[\tau_{i-1}] = -1$. This generates a 
general solution of the form $E[\tau_i] = A \left(\frac{\delta}{1-\delta}\right)^i + B + \frac{i}{2\delta - 1}$, 
where $A, B$ are real constants. Using the boundary condition 
$E[\tau_0] = 0$, we have $A = -B$. Moreover, since $E[\tau_i] \geq 0$, 
we require $A \geq 0$. Now assume that $\delta = 0.5 + \epsilon$ where we 
let $\epsilon \to 0$. As a result, $2\delta - 1 = 2\epsilon \to 0$ and $E[\tau_i] \to \infty$. 
But since $E[\tau_i]$ gives a lower bound on the original $E[\tau]$, we 
also have $E[\tau] \to \infty$ as $\delta \to 0.5$. 
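The general solution of the recurrence can be checked numerically by substituting it back. The sketch below uses assumed function names and imposes the boundary condition through B = -A.

```python
def expected_hitting_time(i, delta, A=0.0):
    """General solution with E[tau_0] = 0 imposed, i.e. B = -A."""
    ratio = delta / (1.0 - delta)
    return A * ratio ** i - A + i / (2.0 * delta - 1.0)

def recurrence_residual(i, delta, A=0.0):
    """Left side minus right side of the recurrence; zero for a valid solution."""
    return ((1.0 - delta) * expected_hitting_time(i + 1, delta, A)
            - expected_hitting_time(i, delta, A)
            + delta * expected_hitting_time(i - 1, delta, A)
            + 1.0)
```

Evaluating `expected_hitting_time(3, 0.501)` against `expected_hitting_time(3, 0.6)` illustrates how the linear term alone grows without bound as the review strength approaches 0.5.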

V. Conclusions and future work 

This paper studied a Bayesian learning model with infor- 
mation cascades. We assumed that subsequent agents can 
observe perfectly the previous actions and, in addition, feed- 
back in the form of noisy reviews depending on the actions. 
We showed that the probabilities that agents cascade toward 
the wrong actions are not monotonic in the reviews quality. 
In particular, noisy reviews could increase the probability 
that agents misinterpret the true value of a good product. 
In practice, in online platforms like Yelp, Amazon, etc. 


customer reviews come with a variety of strengths. Even 
though this scenario was not considered in this paper or 
our previous work, our results indirectly imply that a 
platform planner should opt to cut out low-quality 
reviews and release only the truthful ones. In fact, this 
strategy is already adopted by those platforms, e.g. Amazon 
with verified purchase reviews, or Yelp with filtered reviews. 
Moreover, our results suggest that no matter how much 
the reviews are improved, agents might not perform better 
if their prior knowledge is limited. This implies that a 
platform planner should consider spending their budget on 
improving both the product’s marketing efficiency and the 
reviews’ reliability. 

In future work, we plan to study the possibility of having 
reviews with strengths non-homogeneously distributed 
across the population. In addition, we would like to study the 
probability of wrong cascades for more generic relationships 
between the signal quality and the review strength. Other 
possible directions include considering having reviews when 
both types of actions are taken, letting agents have the option 
to leave the reviews, and assuming that not all agents would 
exercise this option. 
Appendix 

A. Proof sketch of Lemma 2 

For simplicity of notation, the common denominator $i$ 
of the accessible states is implicitly understood and we 
only consider the dynamics of the numerators. Therefore we 
reformulate $\mathcal{A} = \{-i, -i + 1, \ldots, i - 1, i\}$ with the initial 
state $h_0 = 0$. If $h_n \in [-i, i]$, the next state is $h_n + X$ where 
$X \in \{-i, i + j, i - j\}$. If $h_n > i$, the next state is $h_n + Y$ 
where $Y \in \{\pm j\}$. If $h_n < -i$, $h_n$ is absorbing, an N 
cascade happens, and the next state is also $h_n$. 
cascade happens and the next state is also h n . 

1) For $j > 3i$, first verify that, after any number of steps, $\mathcal{B} = 
\{-i, 0, i, 2i, i + aj, 2i + bj\}$, where $a, b \geq 1$ are integers, is the 
set of all accessible states that are not absorbing. Moreover, 
from any state in $\mathcal{B}$, the next state is either another element 
of $\mathcal{B}$, or an absorbing state below $-i$. This can be shown 
using the fact that $j > 3i$. 

For $j \in \{3i, 2i, i\}$, since $\gcd(i, j) = 1$ one can simply 
pick $i = 1$ and $j = 3, 2, 1$ respectively. This proves that the 
accessible states in $\mathcal{A}$ are limited to $\{-i, 0, i\}$. 

2) For $3i > j > 2i$, let $j = 2i + z$ where $0 < z < i$, and let 
$i = kz + y$ where $0 \leq y < z$ (note that $y = 0$ when $z = 1$). 
Let $\mathcal{S} = \{-i, 0, i, -z, -2z, \ldots, -kz, i - z, i - 2z, \ldots, i - 
kz\}$. 

The proof is done in 2 steps. First, verify that all states 
in $\mathcal{S}$ are accessible. This can be shown by considering the 
possible increments. Notice that from $i$ one can access $-z$, 
and from $-az$ one can access both $i - az$ and $-(a + 1)z$ 
for integers $a = 1, \ldots, k - 1$. Finally, $i - kz$ is accessible 
from $-kz$. Secondly, one can show that all accessible states 
are exclusively in $\mathcal{S}$. This is shown by exploring possible 
next states starting from any state $c \in \mathcal{S}$. Eventually, either 
an absorbing state is reached, or one ends up at another state 
in $\mathcal{S}$. 

3) For the first case when $2i > j > i$, let $j = i + z$ 
where $0 < z < i$. Since $\gcd(i, j) = 1$, $\gcd(z, i) = 1$ 
too. The idea is that, since $\gcd(z, i) = 1$, the following 
$i$ integers have different remainders when divided by $i$: 
$0, -z, -2z, \ldots, -(i - 1)z$. Thus, if we can find a set of 
such $i$ states that are accessible, and such that all those $i$ 
states are in $[0, i)$, then this implies that those $i$ states are 
exactly $\{0, 1, \ldots, i - 1\}$. Moreover, it is obvious that state $i$ 
is accessible, thus all states in $[0, i]$ are accessible. As a result, 
all states in $[-i, 0]$ are also accessible by adding an increment 
of $-i$. 

For the second case when $i > j$, let $i = kj + z$ where 
$0 < z < j$. Since $\gcd(i, j) = 1$, $\gcd(z, j) = 1$ too. The 
idea is similar to the first case, but using modulus $j$ instead 
of modulus $i$. First, one needs to show that there exists an 
accessible set $\mathcal{D}$ of $j$ states in $[0, i]$ such that for $d_1, d_2 \in \mathcal{D}$ 
and $d_1 \neq d_2$, we have $d_1 \not\equiv d_2 \pmod{j}$. This says that the 
elements in $\mathcal{D}$, when divided by $j$, fill up all the possible 
remainders $0, 1, \ldots, j - 1$. Next, one needs to show there 
exist paths from those $j$ states of $\mathcal{D}$ to $j$ corresponding 
states in $[0, j)$, which are exactly $0, 1, \ldots, j - 1$. Moreover, 
it is obvious that state $j$ is accessible, thus all states in $[0, j]$ 
are accessible. Finally, one can always add/subtract integer 
multiples of $j$ to access all states in $[-i, i]$. 


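B. Proof of Proposition 2 

Conditioning on $h_1$, there are three possibilities: 1) $S_1 = 
L$ gives $h_1 = -1$, $M_1 = e^{-\lambda} \bar{\rho}$; 2) $S_1 = H$, $R_1 = B$ gives 
$h_1 = 1 - 1/x$, $M_1 = e^{\lambda(1 - 1/x)} \bar{\rho}$; and 3) $S_1 = H$, $R_1 = G$ 
gives $h_1 = 1 + 1/x$, $M_1 = e^{\lambda(1 + 1/x)} \bar{\rho}$, and so we can write: 

$$E[\tau] = \sum_{h_1} P[h_1] \, E[\tau \mid h_1]. \tag{5}$$

Let $\bar{\rho} = 1/\max(f_1(\lambda), f_2(\lambda))$ for regions where $0 < \bar{\rho} < 
1$, i.e. $\lambda \in (\ln(\frac{p}{1-p}), \infty)$. Since $\tau$ is the first time $h_n < -1$, 
it follows that $M_n \geq e^{-\lambda} \bar{\rho}^n$ for $1 \leq n < \tau$, which yields: 

$$\sum_{n=1}^{\tau} M_n \geq e^{-\lambda} \sum_{n=1}^{\tau} \bar{\rho}^n = e^{-\lambda} \left[ \frac{\bar{\rho} - \bar{\rho}^{\tau+1}}{1 - \bar{\rho}} \right].$$

Taking expectation on both sides (conditioned on $h_1$), and 
using the super-martingale property (so that $E[M_n] \leq M_1$) 
we have: 

$$E[\tau \mid h_1] \geq e^{-\lambda} \frac{1}{M_1} \left[ \frac{\bar{\rho} - \bar{\rho} \, E[\bar{\rho}^{\tau} \mid h_1]}{1 - \bar{\rho}} \right]. \tag{6}$$

We next find an upper bound for $E[\bar{\rho}^{\tau} \mid h_1]$. Let $V_n = 
P[\tau = n \mid h_1]$ for $n = 1, 2, \ldots$. We can rewrite: 

$$E[\bar{\rho}^{\tau} \mid h_1] = \sum_{n=1}^{\infty} V_n \bar{\rho}^n = V_1 \bar{\rho} + V_2 \bar{\rho}^2 + V_3 \bar{\rho}^3 + \sum_{n=4}^{\infty} V_n \bar{\rho}^n,$$

where 

$$\sum_{n=4}^{\infty} V_n \bar{\rho}^n \leq \bar{\rho}^4 \sum_{n=4}^{\infty} V_n = \bar{\rho}^4 (1 - V_1 - V_2 - V_3).$$

Substituting this back into (6) and (5), we obtain the bound given 
in the Proposition. 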



